Abstract

Recent studies have explored theoretically the ability of populations of neurons to carry
information about a set of stimuli, both in the case of purely discrete or purely continuous
stimuli, and in the case of multidimensional continuous angular and discrete correlates, in the
presence of additional quenched disorder in the distribution. An analytical expression for the
mutual information has been obtained in the limit of large noise by means of the replica trick.

Here we show that the same results can actually be obtained in most cases without the use of
replicas, by means of a much simpler expansion of the logarithm. Fitting the theoretical model
to real neuronal data, we show that the introduction of correlations in the quenched disorder
improves the fit, suggesting a possible role of signal correlations (actually detected in real
data) in a redundant code. We show that even in the more difficult analysis of the asymptotic
regime, an explicit expression for the mutual information can be obtained without resorting to
the replica trick despite the presence of quenched disorder, both with a gaussian and with a more
realistic thresholded-gaussian model. When the stimuli are mixed continuous and discrete, we find
that with both models the information seems to grow logarithmically to infinity with the number
of neurons and with the inverse of the noise, even though the exact general dependence cannot be
derived explicitly for the thresholded-gaussian model. In the large noise limit lower values of
information were obtained with the thresholded-gaussian model, for a fixed value of the noise and
of the population size. Conversely, in the asymptotic regime, with very low values of the noise,
a lower information value is obtained with the gaussian model.
I. INTRODUCTION

The mutual information, extensively used in the theory of communication [?], has more recently
been proposed as a measure of the coding capacity of real neurons in the brain (see for example
[?, 4, 5] for a general overview). Information estimates, both from real data and in pure
theoretical modelling, ideally quantify how efficiently an external observer might discriminate
between several correlates of behaviour on the basis of the firing of single or multiple cells.

Several theoretical studies have explored the ability of one population of neurons to encode
external stimuli, relevant to behaviour [?]; others have tried to assess how efficiently the
information is transmitted across several layers of a network, which may represent distinct
stages of processing in some brain area [?].

In most of the cited works the replica trick has been successfully used in order to derive an
explicit expression for the mutual information. As we will show in detail in the next section,
given the formula of the information, replicas do appear as a natural methodological choice,
due to the presence of the logarithm of a sum of conditional probabilities depending on some
quenched parameters. Yet in the cited works no attempt has been made to verify whether the
same results can be obtained without resorting to replicas, even in the cases [?] where
the evaluation could be carried out without any additional assumption of replica symmetry.

Moreover, an exact estimate of the mutual information regardless of the population size
$N$ and of the noise $\sigma$ is often unachievable, so that an analytical expression can be
provided only in some limit cases. It might well be that restricting oneself to these cases makes
the use of replicas redundant, or at least an alternative choice to other methods.

In particular, [?] have used replicas to study the initial linear rise of the information,
characterized by small population sizes and large noise in the firing distributions of the
neurons; this limit would roughly correspond to a high temperature regime for a physical
system like a spin glass. It is reasonable to think that this limit can be treated and solved
without replicas, since it is known that annealed and quenched averages coincide in the high
temperature regime.

Here we first reconsider the analysis performed in [?]; we show that, in the limit when
the noise $\sigma$ is large and the population size $N$ is small, the same analytical expressions
for the information can be obtained without the use of the replica trick, by means of a simple
Taylor expansion of the logarithm, regardless of the nature of the stimulus, whether purely
discrete or mixed continuous and discrete, and both with a gaussian and with a more realistic
thresholded-gaussian firing distribution.

In the particular case of mixed continuous angular and discrete stimuli the distribution
had been parameterized in order to model the firing of neurons recorded from the motor
cortex of monkeys performing arm movements, categorized according to a direction and
a type [?]. Within this data set, correlations in the preferred direction of
a given unit across different movement types were actually observed, but the impact of
such correlations on the information content was not quantified. Thus here we investigate
theoretically whether correlations introduced in the quenched parameters characterizing the
distribution can improve the fit of real information curves provided by the model. This would
suggest that such correlations are information bearing or, more precisely, that they depress the
information, leading to a redundant code.

We then move to the limit of large population sizes and small noise. An attempt to study
this regime in the presence of purely discrete stimuli by means of replicas was unsuccessful
in [?].

Here we show that even in the asymptotic regime, in the case of purely discrete stimuli,
an analytical expression for the mutual information can be provided without the use of the
replica trick.

Another replica-free approach to this limit was proposed in [9], applicable both to the case
of continuous and discrete stimuli, for a generic firing distribution, provided that it can be
factorized into single neuron probability density functions. No additional quenched disorder
was assumed in the distribution. Here we apply this method to our particular model
and we find the assumption under which we retrieve our original approximation.

Finally, in [?] it has been shown that, when limited to the initial linear regime, both
the gaussian and the thresholded-gaussian model provide the same analytical expression for
the mutual information, except for a renormalization of a noise parameter. In particular, lower
values of the information were obtained with the thresholded-gaussian model. We investigate
this issue in the asymptotic regime, comparing the leading term of the information for both
models.



II. POPULATION INFORMATION IN THE INITIAL LINEAR REGIME 

A. Coding of purely discrete and mixed continuous and discrete stimuli in a gaussian
approximation

The firing of neurons emerging from the analysis of real data is characterized by strong
irregularities and by a wide variability. The choice of a gaussian model as a possible firing
rate distribution might therefore seem unrealistic and unjustified. Yet with a large sample of
data it is likely that most irregularities in the distribution average out; their presence is
often due to poor sampling, which in turn biases information estimates, so that smoothing
with a gaussian or other kernels has become a standard procedure in data analysis (see [?]
for a review of several regularizing procedures). The advantage in using a gaussian
approximation is an easier mathematical analysis, which allows for the derivation of an explicit
expression for the mutual information [?]. Moreover, at least in the regime of the
initial information rise, the use of a more realistic model leads to the same mathematical
expression for the mutual information, except for a renormalization of the noise [?]. This
last issue will be discussed in more detail in the next section.

Let us consider a population of $N$ independent cells which fire to a set of $p$ discrete stimuli,
parameterized by a discrete variable $s$, according to a gaussian distribution:

$$ p(\{\eta_i\}|s) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(\eta_i - \eta_i^s)^2}{2\sigma^2}\right] \qquad (1) $$

where $\eta_i$ is the firing rate of the $i$-th input neuron, while $\eta_i^s$ is its mean rate in
response to stimulus $s$.

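The mutual information between the neuronal firing rates $\{\eta_i\}$ and the stimuli $s$ reads:

$$ I(\{\eta_i\}, s) = \sum_s p(s) \int \prod_i d\eta_i \, p(\{\eta_i\}|s) \log_2 \frac{p(\{\eta_i\}|s)}{p(\{\eta_i\})} \qquad (2) $$

Since $p(\{\eta_i\})$ can be written as $\sum_s p(s)\, p(\{\eta_i\}|s)$, it is easy to show that the
mutual information can be expressed as the difference between the entropy of the firing rates
$H(\{\eta_i\})$ and the equivocation $\langle H(\{\eta_i\}|s)\rangle_s$:

$$ I(\{\eta_i\}, s) = H(\{\eta_i\}) - \langle H(\{\eta_i\}|s)\rangle_s \qquad (3) $$

with:

$$ \langle H(\{\eta_i\}|s)\rangle_s = -\sum_s p(s) \int \prod_i d\eta_i \, p(\{\eta_i\}|s) \log_2 p(\{\eta_i\}|s) \qquad (4) $$

$$ H(\{\eta_i\}) = -\sum_s p(s) \int \prod_i d\eta_i \, p(\{\eta_i\}|s) \log_2 \left[\sum_{s'} p(s')\, p(\{\eta_i\}|s')\right] \qquad (5) $$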
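The decomposition in eqs. (3)-(5) can be illustrated on a toy problem. The sketch below is not part of the original analysis: it replaces the continuous rates by a finite response set (an assumption made only to avoid integrals), and the function names and the example channel are hypothetical. It checks that the entropy difference of eq. (3) coincides with the direct definition of eq. (2).

```python
import math


def entropy_bits(distribution):
    """Shannon entropy (in bits) of a finite probability distribution."""
    return -sum(q * math.log2(q) for q in distribution if q > 0.0)


def response_marginal(prior, conditional):
    """p(r) = sum_s p(s) p(r|s) for a finite response set."""
    n_responses = len(conditional[0])
    return [
        sum(prior[s] * conditional[s][r] for s in range(len(prior)))
        for r in range(n_responses)
    ]


def information_from_entropies(prior, conditional):
    """Discrete analogue of eq. (3): response entropy minus equivocation."""
    marginal = response_marginal(prior, conditional)
    equivocation = sum(
        prior[s] * entropy_bits(conditional[s]) for s in range(len(prior))
    )
    return entropy_bits(marginal) - equivocation


def information_direct(prior, conditional):
    """Discrete analogue of eq. (2): average of log2[p(r|s) / p(r)]."""
    marginal = response_marginal(prior, conditional)
    total = 0.0
    for s, p_s in enumerate(prior):
        for r, p_rs in enumerate(conditional[s]):
            if p_rs > 0.0:
                total += p_s * p_rs * math.log2(p_rs / marginal[r])
    return total
```

For two equally likely stimuli with non-overlapping responses both functions return one bit, which is the discrete counterpart of the noiseless upper bound $\log_2 p$ discussed in the text.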



The variables $\{\eta_i\}$ in $p(\{\eta_i\}|s')$ are quenched: the sum on the stimuli $s'$ should be
performed, and the logarithm taken, for any fixed configuration $\{\eta_i\}$, before integrating on
$\{\eta_i\}$. The replica trick, devised to perform averages of the partition function across
quenched disorder in spin glasses [?], seems to apply also to this case. Yet, contrary to what is
found in the theory of spin glasses, where the connectivities vary on a much longer time scale
than the spins and are therefore quenched, here the presence of quenched disorder
does not reflect any real distinction between two separate time scales. In fact the same
sum appears outside the logarithm, and if one were able to explicitly derive $p(\{\eta_i\})$ from
$p(\{\eta_i\}|s')$ there would be no need for replicas to evaluate $H(\{\eta_i\})$.

In the specific case of the distribution (1), $p(\{\eta_i\})$ has a functional dependence on the
configuration of the average rates $\{\eta_i^s\}$ and it cannot be explicitly derived except for
some trivial cases, like:

$$ \eta_i^s = \eta_i^0 \quad \forall s \qquad (6) $$

where the information is obviously zero, since $p(\{\eta_i\}|s)$ does not depend on $s$ anymore; or
the opposite noiseless limit, where the cells fire at each stimulus $s$ always with a pattern
$\{\eta_i^s\}$ and the configurations $\{\eta_i^s\}$ across the stimuli do not overlap. In this
case, when the stimuli are equally likely, so that $p(s) = 1/p$, one has:

$$ p(\{\eta_i\}|s) = \delta(\{\eta_i\} - \{\eta_i^s\}) \qquad (7) $$

$$ p(\{\eta_i\}) = \frac{1}{p}\sum_s \delta(\{\eta_i\} - \{\eta_i^s\}) \qquad (8) $$

and since the average configurations $\{\eta_i^s\}$ do not overlap it is easy to see that the
mutual information reaches the upper bound of $\log_2 p$.

In a more realistic context, the average firing rates $\{\eta_i^s\}$ are not kept fixed, reflecting
the strong variability of the neural activity detected in real data. Therefore, in order to obtain
an information estimate independent of a particular configuration of the selectivities, the
variables $\{\eta_i^s\}$ are considered quenched and the information must finally be averaged
across the distribution of $\{\eta_i^s\}$:

$$ I(\{\eta_i\}, s) \longrightarrow \langle I(\{\eta_i^s\}) \rangle_\eta \qquad (9) $$

$I(\{\eta_i^s\})$ is the mutual information between the neuronal firing rates $\{\eta_i\}$ and the
stimuli $s$ evaluated according to eq. (2), for a particular configuration of the mean rates
$\{\eta_i^s\}$.

This approach has been followed in [?], where the replica trick has been used to perform
the analytical evaluation.

Let us consider the case where quenched disorder is uncorrelated and identically
distributed across units and across the $p$ discrete correlates:

$$ \varrho(\{\eta_i^s\}) = \prod_{i,s} \varrho(\eta_i^s) = [\varrho(\eta)]^{Np} \qquad (11) $$

As already shown in [?], it is easy to prove that for a population of independent units
the equivocation $\langle H(\{\eta_i\}|s)\rangle_s$ is additive.

By $I(\{\eta_i\},s)$, $H(\{\eta_i\})$ and $\langle H(\{\eta_i\}|s)\rangle_s$ in the following I will
implicitly mean the corresponding quenched averaged quantities. Inserting eq. (1) in eq. (4) one
obtains:

$$ \langle H(\{\eta_i\}|s)\rangle_s = \frac{N}{2\ln 2}\left(1 + \ln 2\pi\sigma^2\right) \qquad (12) $$
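As a sanity check on eq. (12), the differential entropy of a single gaussian cell can be integrated numerically and compared with the closed form $\frac{1}{2\ln 2}(1 + \ln 2\pi\sigma^2)$; additivity then gives the factor $N$. This is only an illustrative sketch: the function names, the integration window and the value of the noise used in the usage note are assumptions, not quantities taken from the paper.

```python
import math


def gaussian_entropy_bits(sigma, steps=20000, width=12.0):
    """Differential entropy (bits) of one gaussian cell, by the trapezoid rule."""
    lower, upper = -width * sigma, width * sigma
    step = (upper - lower) / steps
    norm = math.sqrt(2.0 * math.pi * sigma ** 2)
    total = 0.0
    for k in range(steps + 1):
        x = lower + k * step
        density = math.exp(-x * x / (2.0 * sigma ** 2)) / norm
        weight = 0.5 if k in (0, steps) else 1.0
        total -= weight * density * math.log2(density)
    return total * step


def equivocation_bits(n_cells, sigma):
    """Closed form of eq. (12): N / (2 ln 2) * (1 + ln(2 pi sigma^2))."""
    return n_cells / (2.0 * math.log(2.0)) * (1.0 + math.log(2.0 * math.pi * sigma ** 2))
```

For instance, `gaussian_entropy_bits(0.7)` and `equivocation_bits(1, 0.7)` should agree to within the quadrature error. The mean rate does not enter, which is why the equivocation is independent of the quenched disorder.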

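I turn now to the more difficult evaluation of the rate entropy. Inserting eq. (1) in eq. (5)
and using the equivalence:

$$ \ln\left[\sum_{s'} p(s') \prod_i \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(\eta_i - \eta_i^{s'})^2/2\sigma^2}\right] = -\frac{N}{2}\ln 2\pi\sigma^2 - \sum_i \frac{\eta_i^2}{2\sigma^2} + \ln\left[\sum_{s'} p(s') \prod_i e^{-\left((\eta_i^{s'})^2 - 2\eta_i\eta_i^{s'}\right)/2\sigma^2}\right] \qquad (13) $$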
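one can integrate out the second term on the rhs in eq. (13); the result, added to the first term,
simplifies with the equivocation, eq. (12); rearranging all the terms the mutual information
can be written in the following form:

$$ I(\{\eta_i\}, s) = -\frac{1}{\ln 2}\left\langle \sum_s p(s) \int \prod_i d\eta_i \prod_i \frac{e^{-(\eta_i - \eta_i^s)^2/2\sigma^2}}{\sqrt{2\pi\sigma^2}} \ln\left[\sum_{s'} p(s') \prod_i e^{-\left((\eta_i^s)^2 + (\eta_i^{s'})^2 - 2\eta_i\eta_i^{s'}\right)/2\sigma^2}\right]\right\rangle_\eta \qquad (14) $$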

Due to the presence of the sums and of the quenched disorder under the logarithm an
analytical expression of the mutual information cannot be obtained in the general case; yet
the evaluation can be performed in some limit cases. We focus here on the initial regime,
where the number of cells is not large compared to the noise. The asymptotic regime for
large population sizes will be discussed later on.

As has been shown in [?], one way to get rid of the logarithm and to perform the
quenched averages and the sums in eq. (14) is by means of the replica trick [14]; yet, in the
limit when the noise $\sigma$ is very large and the population size $N$ is not large, a
straightforward and natural approach consists in performing a simple Taylor expansion of the
exponentials under the logarithm:

(15)



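The terms of order $1/\sigma^4$ must be kept because it can be shown that after integration on
$\{\eta_i\}$ they actually give a contribution of order $1/\sigma^2$ to the mutual information.

Inserting the expansion (15) in eq. (14) and performing the integration on $\{\eta_i\}$ one obtains:

$$ I(\{\eta_i\}, s) \simeq \frac{1}{\ln 2}\frac{1}{2\sigma^2} \sum_i \left\langle \sum_s p(s)(\eta_i^s)^2 - \sum_{s,s'} p(s)p(s')\,\eta_i^s\eta_i^{s'} \right\rangle_\eta = \frac{1}{\ln 2}\frac{N}{2\sigma^2} \left\langle \sum_s p(s)(\eta^s)^2 - \sum_{s,s'} p(s)p(s')\,\eta^s\eta^{s'} \right\rangle_\eta \qquad (16) $$

where we have used the fact that quenched disorder is uncorrelated and identically
distributed across stimuli and neurons.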

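The averages across quenched disorder and across $s, s'$ can be performed distinguishing
between the cases $s = s'$ and $s \neq s'$. For equally likely stimuli, $p(s) = 1/p$, the final
result for the mutual information up to order $N/\sigma^2$ reads:

$$ I(\{\eta_i\}, s) \simeq \frac{1}{\ln 2}\frac{N}{2\sigma^2}\frac{p-1}{p}\left[\langle \eta^2\rangle_\eta - \langle \eta\rangle_\eta^2\right] \qquad (17) $$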



The same result has been obtained in [?] by means of the replica trick. We have checked
that the agreement between the two approaches is found also at higher orders in $N/\sigma^2$. Yet
the derivation via the replica trick is clearly longer and more complicated, and a priori less
controllable than the simple Taylor expansion used here to derive the same results.
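The large-noise expansion can be checked numerically in the simplest possible setting. The sketch below assumes a single cell ($N = 1$) with fixed mean rates, so that no quenched average is needed, and compares the exact information of eq. (2), obtained by one-dimensional quadrature, with the leading term $\mathrm{Var}_s(\eta^s)/(2\sigma^2 \ln 2)$ that follows from eq. (16) for equally likely stimuli. The function names and the parameter values in the usage note are hypothetical.

```python
import math


def gaussian_density(rate, mean_rate, sigma):
    """Single-cell gaussian firing-rate density, as in eq. (1) with N = 1."""
    norm = math.sqrt(2.0 * math.pi * sigma ** 2)
    return math.exp(-(rate - mean_rate) ** 2 / (2.0 * sigma ** 2)) / norm


def exact_information_bits(mean_rates, sigma, steps=20000, width=10.0):
    """Eq. (2) for one cell and equally likely stimuli, by the trapezoid rule."""
    n_stimuli = len(mean_rates)
    lower = min(mean_rates) - width * sigma
    upper = max(mean_rates) + width * sigma
    step = (upper - lower) / steps
    total = 0.0
    for k in range(steps + 1):
        rate = lower + k * step
        conditional = [gaussian_density(rate, mean, sigma) for mean in mean_rates]
        mixture = sum(conditional) / n_stimuli
        value = sum(c * math.log2(c / mixture) for c in conditional if c > 0.0)
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * value / n_stimuli
    return total * step


def linear_information_bits(mean_rates, sigma):
    """Leading large-noise term: variance of the mean rates over 2 sigma^2 ln 2."""
    n_stimuli = len(mean_rates)
    mean = sum(mean_rates) / n_stimuli
    variance = sum((rate - mean) ** 2 for rate in mean_rates) / n_stimuli
    return variance / (2.0 * sigma ** 2 * math.log(2.0))
```

With, say, mean rates `[0.0, 1.0]` and `sigma = 5.0` the two estimates should differ only by a small relative correction, which shrinks as the noise grows; this is the regime in which the expansion of the logarithm is expected to be reliable.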

The interest in the coding of purely discrete stimuli arises naturally from the need to
provide a theoretical framework allowing a direct quantitative comparison with the results
of real experiments. In fact, in a typical experimental protocol neural activity is recorded
from some areas in the brain, while the subject (human or animal) is presented with a discrete
number of stimuli, or is trained to perform a discrete number of tasks.

Yet natural stimuli are multi-dimensional and some of the dimensions can vary in a
continuous domain. For example a visual stimulus can be parameterized through its colour
(varying within a discrete set of possible choices) and its orientation (represented by a
continuous angle). It is therefore of primary theoretical interest to extend our results to the
case where the stimulus is multi-dimensional and the dimensions may be discrete and
continuous.

In [?] the coding of movements categorized according to their direction (continuous
dimension) and their type (discrete dimension) has been studied via direct information
estimates from real data and pure theoretical modelling. In particular, in [?] the information
between the neuronal firing rates and the movements has been evaluated in the limit of large
noise and finite population size, in the presence of quenched disorder and resorting to the
replica trick.




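In analogy to the model studied in [?], let us consider a population of $N$ neurons firing
independently of one another to an external stimulus parameterized by an angle $\vartheta$ and a
discrete variable $s$, according to a gaussian distribution:

$$ p(\{\eta_i\}|\vartheta, s) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(\eta_i - \tilde\eta_i(\vartheta, s))^2}{2\sigma^2}\right] \qquad (18) $$

Like in eq. (1), $\eta_i$ is the firing rate of the $i$-th neuron; $\tilde\eta_i(\vartheta, s)$ is its
average firing rate corresponding to the stimulus $(\vartheta, s)$:

$$ \tilde\eta_i(\vartheta, s) = \varepsilon_s^i\, \bar\eta_i(\vartheta - \vartheta^0_{i,s}) + (1 - \varepsilon_s^i)\,\eta_f \qquad (19) $$

$$ \bar\eta_i(\vartheta - \vartheta^0_{i,s}) = \eta^0 \cos^{2m}\left(\frac{\vartheta - \vartheta^0_{i,s}}{2}\right) \qquad (20) $$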

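where $\varepsilon_s^i$ and $\vartheta^0_{i,s}$ are sources of quenched disorder, distributed
respectively between 0 and 1 and between 0 and $2\pi$, and I assume that quenched disorder is
uncorrelated and identically distributed across neurons and stimuli:

$$ \varrho(\{\varepsilon_s^i\}) = \prod_{i,s} \varrho(\varepsilon_s^i) = [\varrho(\varepsilon)]^{Np} \qquad (21) $$

$$ \varrho(\{\vartheta^0_{i,s}\}) = \prod_{i,s} \varrho(\vartheta^0_{i,s}) = [\varrho(\vartheta^0)]^{Np} = \frac{1}{(2\pi)^{Np}} \qquad (22) $$

Eq. (19) states that for each discrete correlate $s$ each neuron $i$ fires at an average rate
modulating with $\vartheta$ around the preferred direction $\vartheta^0_{i,s}$ with an amplitude
$\varepsilon_s^i$; alternatively the average rate is fixed to a value $\eta_f$, independently of
$\vartheta$, with amplitude $1 - \varepsilon_s^i$. In [?] it has been shown that a similar choice for
the average rate can effectively reproduce the main features of the directional tuning curves of
real neurons.

The basic definitions (2)-(5), as well as the initial treatment, can be easily generalized
to the case of structured stimuli, via the replacements:

$$ \sum_s p(s) \longrightarrow \sum_s p(s) \int d\vartheta\, p(\vartheta), \qquad \eta_i^s \longrightarrow \tilde\eta_i(\vartheta, s), \qquad \langle \ldots \rangle_\eta \longrightarrow \langle \ldots \rangle_{\varepsilon, \vartheta^0} \qquad (23) $$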
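It is easy to show that the mutual information can be expressed in a form analogous to
eq. (14):

$$ I(\{\eta_i\}, \vartheta \otimes s) = -\frac{1}{\ln 2}\left\langle \sum_s p(s) \int d\vartheta\, p(\vartheta) \int \prod_i d\eta_i \prod_i \frac{e^{-(\eta_i - \tilde\eta_i(\vartheta, s))^2/2\sigma^2}}{\sqrt{2\pi\sigma^2}} \ln\left[\sum_{s'} p(s') \int d\vartheta'\, p(\vartheta') \prod_i e^{-\left((\tilde\eta_i(\vartheta, s))^2 + (\tilde\eta_i(\vartheta', s'))^2 - 2\eta_i\tilde\eta_i(\vartheta', s')\right)/2\sigma^2}\right]\right\rangle_{\varepsilon, \vartheta^0} \qquad (24) $$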



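We use again an expansion of the logarithm similar to (15); it is then easy to derive the
analogue of eq. (16):

$$ I(\{\eta_i\}, \vartheta \otimes s) \simeq \frac{1}{\ln 2}\frac{N}{2\sigma^2}\left\langle \sum_s p(s) \int d\vartheta\, p(\vartheta)\,[\tilde\eta(\vartheta, s)]^2 - \sum_{s,s'} p(s)p(s') \int d\vartheta\, d\vartheta'\, p(\vartheta)p(\vartheta')\,\tilde\eta(\vartheta, s)\tilde\eta(\vartheta', s')\right\rangle_{\varepsilon, \vartheta^0} $$

From eqs. (19), (20), (21), (22) it is easy to verify that:

$$ \left\langle \sum_s p(s) \int d\vartheta\, p(\vartheta)\,[\tilde\eta(\vartheta, s)]^2 \right\rangle_{\varepsilon, \vartheta^0} = (\eta^0)^2\left[(A_2 + \alpha^2 - 2\alpha A_1)\langle\varepsilon^2\rangle_\varepsilon + \alpha^2 + 2\alpha(A_1 - \alpha)\langle\varepsilon\rangle_\varepsilon\right] \qquad (25) $$

$$ \left\langle \sum_{s,s'} p(s)p(s') \int d\vartheta\, d\vartheta'\, p(\vartheta)p(\vartheta')\,\tilde\eta(\vartheta, s)\tilde\eta(\vartheta', s') \right\rangle_{\varepsilon, \vartheta^0} = (\eta^0)^2\left[(A_1 - \alpha)^2\left(\frac{p-1}{p}\langle\varepsilon\rangle_\varepsilon^2 + \frac{1}{p}\langle\varepsilon^2\rangle_\varepsilon\right) + \alpha^2 + 2\alpha(A_1 - \alpha)\langle\varepsilon\rangle_\varepsilon\right] \qquad (26) $$

$$ A_1 = \frac{1}{2^{2m}}\binom{2m}{m}; \qquad A_2 = \frac{1}{2^{4m}}\binom{4m}{2m}; \qquad \alpha = \frac{\eta_f}{\eta^0} \qquad (27) $$

Inserting eqs. (25), (26) in the expansion above one obtains the final expression for the mutual
information up to order $N/\sigma^2$: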



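$$ I(\{\eta_i\}, \vartheta \otimes s) \simeq \frac{1}{\ln 2}\frac{N(\eta^0)^2}{4\sigma^2}\left\{\frac{p-1}{p}\, 2(\alpha - A_1)^2\left[\langle\varepsilon^2\rangle_\varepsilon - \langle\varepsilon\rangle_\varepsilon^2\right] + 2\left(A_2 - (A_1)^2\right)\langle\varepsilon^2\rangle_\varepsilon\right\} \qquad (28) $$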
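The constants entering the linear-order result can be checked numerically. The sketch below rests on a reading of the garbled source that should be treated as an assumption: the tuning curve of eq. (20) is taken to be $\eta^0\cos^{2m}((\vartheta-\vartheta^0)/2)$, $A_1$ and $A_2$ are taken to be its first and second angular moments in units of $\eta^0$ and $(\eta^0)^2$, and $\alpha$ is read as $\eta_f/\eta^0$. Under these assumptions the code verifies the binomial closed forms of the moments and the fact that the difference between the second moment and the cross term reduces to half of the bracket in eq. (28). All function names are hypothetical, as is the value of `alpha` used in any example.

```python
import math


def tuning_moment(power, steps=4096):
    """Average of cos(x / 2) ** power over x uniform on [0, 2 * pi)."""
    step = 2.0 * math.pi / steps
    return sum(math.cos(0.5 * k * step) ** power for k in range(steps)) / steps


def a1(m):
    """First angular moment of cos^(2m)(x / 2): binomial(2m, m) / 2^(2m)."""
    return math.comb(2 * m, m) / 2 ** (2 * m)


def a2(m):
    """Second angular moment, i.e. average of cos^(4m)(x / 2)."""
    return math.comb(4 * m, 2 * m) / 2 ** (4 * m)


def moment_difference(p, m, alpha, eps_mean, eps_sq_mean):
    """Second moment minus cross term of the mean rate, in units of eta0 ** 2."""
    shift = a1(m) - alpha
    common = alpha ** 2 + 2.0 * alpha * shift * eps_mean
    second = (a2(m) + alpha ** 2 - 2.0 * alpha * a1(m)) * eps_sq_mean + common
    cross = shift ** 2 * ((p - 1) / p * eps_mean ** 2 + eps_sq_mean / p) + common
    return second - cross


def linear_information_bits(n_cells, snr, p, m, alpha, eps_mean, eps_sq_mean):
    """Linear-order information; snr stands for (eta0 / (2 * sigma)) ** 2."""
    eps_var = eps_sq_mean - eps_mean ** 2
    bracket = ((p - 1) / p) * 2.0 * (alpha - a1(m)) ** 2 * eps_var
    bracket += 2.0 * (a2(m) - a1(m) ** 2) * eps_sq_mean
    return n_cells * snr * bracket / math.log(2.0)
```

The two terms of the bracket have a simple reading: the first vanishes when the amplitude $\varepsilon$ does not fluctuate or when there is a single discrete correlate, while the second is proportional to the angular variance of the tuning curve and carries the information about the continuous dimension.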



In [?] it has been shown that the same expression for the information in linear
approximation is obtained in the case where the preferred directions do not modulate with the
discrete stimuli: $\vartheta^0_{i,s} = \vartheta^0_i\ \forall s, i$. It can be easily proved that a
different contribution to the information would derive in either case from the term
$\langle\tilde\eta(\vartheta, s)\tilde\eta(\vartheta', s')\rangle$, yet the differing term becomes
zero when averaged across the angles.

Setting $\vartheta^0_{i,s} = \vartheta^0_i\ \forall s$ corresponds to correlating the signal that
each neuron carries about different stimuli $s$. Intuitively, since no difference in the preferred
orientation can be detected any more while looking at distinct correlates $s$, one would expect an
information loss. Such a loss is indeed present, as revealed by a detailed evaluation of the
quadratic contribution in the population size $N$. We do not report the calculation, which is a
straightforward application of perturbation theory, very similar to the one performed for the
linear approximation in $N$, and which consists in retaining all the terms of order
$N^2(\eta^0)^4/\sigma^4$ out of the expansion of the logarithm.

The final expression for the second order contributions to the information in either case
reads:

$$ \frac{1}{\ln 2}\frac{N^2(\eta^0)^4}{2(4\sigma^2)^2}\,[\ldots] $$

where by [...] we mean the quadratic contribution in the correlated case
$\vartheta^0_{i,s} = \vartheta^0_i\ \forall s, i$.

The same expression as in eq. (29) has been obtained in [?] by means of the replica trick.

Fig. 1, on the left, shows an example neuron recorded in the SMA area, whose preferred
direction does not modulate with the discrete dimension (reproduced from [?]). Within
this data set such neurons were statistically dominant, even though the significance of
this observation should be quantified by means of the analysis of other samples of cells.

On the right we show the theoretical curves in the quadratic approximation corresponding
to the best linear fit, both for the correlated and uncorrelated case. The curves are
compared to the information as estimated from a population of SMA cells, showing that
the introduction of correlations in the preferred directions improves the fit. Although far from
proving that this precise type of signal correlation is the actual mechanism used by SMA
cells, this result suggests that real cells transmit information firing in a correlated way.

FIG. 1: (a) Directional tuning for a cell recorded in the right supplementary motor area of a
monkey performing 4 different types of arm movement. UniLt=unimanual left; UniRt=unimanual right;
BiSym=bimanual symmetric; BiOpp=bimanual opposite, reproduced from [?]. (b) Comparison between
the theoretical curves, eqs. (28), (29), and the information estimated from a sample
of cells recorded in the right supplementary motor area [16]; $m = 1$; $p = 2$; the distribution
$\varrho(\varepsilon)$ in eq. (21) is just equal to 1/3 for each of the three allowed values of
$\varepsilon$, 0, 1/2, 1; $(\eta^0/2\sigma)^2 = 0.64$.



B. Coding of mixed continuous and discrete stimuli with a thresholded-gaussian model

Until now we have examined the information carried about stimuli characterized by discrete
or mixed continuous and discrete dimensions, assuming that the firing is
independent across cells and gaussian distributed. Yet, as already remarked, both assumptions
provide a rough approximation of the firing distribution of real neurons. The least
justified one seems to be the gaussian assumption, since it implies that negative rates also
have a nonzero probability of occurrence; a priori the information rise might be more or
less seriously distorted.

This question has been investigated in [?], where a more realistic model has been proposed,
truncating the gaussian distribution and adding a delta peak at zero.

Let us consider once again a population of independent units firing to mixed continuous
and discrete stimuli $\vartheta \otimes s$, where the single neuron distribution is written as
follows:



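$$ p(\eta_i|\vartheta, s) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(\eta_i - \tilde\eta_i(\vartheta, s))^2}{2\sigma^2}\right]\Theta(\eta_i) + 2\left(1 - \mathrm{erf}(\tilde\eta_i(\vartheta, s)/\sigma)\right)\delta(\eta_i)\Theta(-\eta_i) \qquad (29) $$

$\Theta(x)$ is the Heaviside step function and $\tilde\eta_i(\vartheta, s)$ has already been defined
in eq. (19). $\mathrm{erf}(x)$ is the error function:

$$ \mathrm{erf}(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} dt\, e^{-t^2/2} \qquad (30) $$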
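Two features of the thresholded model are easy to verify numerically. The sketch below assumes that the error function of eq. (30) is the standard normal cumulative distribution (so that it differs from the conventional `math.erf`), and that the delta peak carries the total weight $1 - \mathrm{erf}(\tilde\eta/\sigma)$ once the factor 2 is combined with the step function. It checks that the truncated gaussian plus the delta peak is normalized. The function names and the numerical values in the usage note are hypothetical.

```python
import math


def paper_erf(x):
    """Error function in the convention of eq. (30): the standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def silence_probability(mean_rate, sigma):
    """Weight of the delta peak at zero rate, 1 - erf(mean_rate / sigma)."""
    return 1.0 - paper_erf(mean_rate / sigma)


def positive_rate_mass(mean_rate, sigma, steps=20000, width=12.0):
    """Integral of the truncated gaussian over positive rates (trapezoid rule)."""
    upper = max(mean_rate, 0.0) + width * sigma
    step = upper / steps
    norm = math.sqrt(2.0 * math.pi * sigma ** 2)
    total = 0.0
    for k in range(steps + 1):
        rate = k * step
        density = math.exp(-(rate - mean_rate) ** 2 / (2.0 * sigma ** 2)) / norm
        total += (0.5 if k in (0, steps) else 1.0) * density
    return total * step
```

For example, `silence_probability(1.0, 2.0) + positive_rate_mass(1.0, 2.0)` should be equal to one up to the quadrature error. For small arguments `paper_erf(x)` is close to $1/2 + x/\sqrt{2\pi}$, which is the expansion used later in the large noise limit.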

In [?] the mutual information $I(\{\eta_i\}, \vartheta \otimes s)$ has been evaluated by means of
the replica trick, in the limit of large noise $\sigma$. The interest in this limit arises since
the larger $\sigma$ is, the larger the gaussian weight assigned as a whole to negative rates; a
consequence might be a larger distortion in the information values.

We show here how the same results can be obtained without the use of the replica trick.

As usual, the information can be expressed as the difference between the output entropy
and the equivocation, analogously to eqs. (4), (5) and considering the replacements (23). In
[3] it is shown that the equivocation can be calculated quite easily as a sum of single neuron
terms; assuming as usual that quenched disorder is uncorrelated and identically distributed
across stimuli and neurons, according to eqs. (21), (22), one obtains:



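$$ \langle H(\{\eta_i\}|\vartheta, s)\rangle_{\vartheta, s} = \frac{N}{2\ln 2}\left\{\left(1 + \ln 2\pi\sigma^2\right)\left\langle \mathrm{erf}(\tilde\eta(\vartheta, s)/\sigma)\right\rangle_{\varepsilon, \vartheta^0} - \left\langle \frac{\tilde\eta(\vartheta, s)}{\sqrt{2\pi}\,\sigma}\, e^{-\tilde\eta^2(\vartheta, s)/2\sigma^2}\right\rangle_{\varepsilon, \vartheta^0} + 2\left\langle 1 - \mathrm{erf}(\tilde\eta(\vartheta, s)/\sigma)\right\rangle_{\varepsilon, \vartheta^0} \ln\frac{\epsilon}{2} - 2\left\langle \left[1 - \mathrm{erf}(\tilde\eta(\vartheta, s)/\sigma)\right] \ln\left[1 - \mathrm{erf}(\tilde\eta(\vartheta, s)/\sigma)\right]\right\rangle_{\varepsilon, \vartheta^0}\right\} \qquad (31) $$

where we have used the representation of the delta function:

$$ \int_{-\infty}^{+\infty} dx\, \delta(x) F(x) = \lim_{\epsilon \to 0} \int_{-\epsilon/2}^{\epsilon/2} dx\, \frac{1}{\epsilon} F(x) \qquad (32) $$

The equivocation diverges when $\epsilon \to 0$, but, as we will show later on, this divergence is
canceled exactly by a corresponding term in the entropy of the responses, yielding a finite
result for the mutual information.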

The average across quenched disorder can be performed in the limit of large $\sigma$,
expanding the error functions in eq. (31) in powers of $1/\sigma$. First we evaluate the entropy of
the responses, since we will show that there is a partial cancellation of terms.

Considering eq. (5) for the entropy of the responses and the replacements (23), it is easy
to show that in the case of the distribution (29) one obtains:



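$$ H(\{\eta_i\}) = -\left\langle \sum_s p(s) \int d\vartheta\, p(\vartheta) \int \prod_i d\eta_i \prod_i \left[\frac{1}{\sqrt{2\pi\sigma^2}} e^{-(\eta_i - \tilde\eta_i(\vartheta, s))^2/2\sigma^2}\,\Theta(\eta_i) + 2\left(1 - \mathrm{erf}(\tilde\eta_i(\vartheta, s)/\sigma)\right)\delta(\eta_i)\Theta(-\eta_i)\right] \log_2\left\{\sum_{s'} \int d\vartheta'\, p(\vartheta', s') \prod_i \left[\frac{1}{\sqrt{2\pi\sigma^2}} e^{-(\eta_i - \tilde\eta_i(\vartheta', s'))^2/2\sigma^2}\,\Theta(\eta_i) + 2\left(1 - \mathrm{erf}(\tilde\eta_i(\vartheta', s')/\sigma)\right)\delta(\eta_i)\Theta(-\eta_i)\right]\right\}\right\rangle_{\varepsilon, \vartheta^0} \qquad (33) $$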



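Expanding the products on the neuron index $i$ and taking into account the symmetry in
the distribution of different units, eq. (33) can be rewritten as follows:

$$ H(\{\eta_i\}) = -\left\langle \sum_s p(s) \int d\vartheta\, p(\vartheta) \sum_{k=0}^{N} \binom{N}{k} \int_0^{\infty} \prod_{i=1}^{k} d\eta_i \int_{-\infty}^{0} \prod_{i=k+1}^{N} d\eta_i \prod_{i=1}^{k} \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(\eta_i - \tilde\eta_i(\vartheta, s))^2/2\sigma^2}\; 2^{N-k} \prod_{i=k+1}^{N} \left(1 - \mathrm{erf}(\tilde\eta_i(\vartheta, s)/\sigma)\right)\delta(\eta_i) \log_2\left[\sum_{s'} \int d\vartheta'\, p(\vartheta', s') \prod_{i=1}^{k} \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(\eta_i - \tilde\eta_i(\vartheta', s'))^2/2\sigma^2}\; 2^{N-k} \prod_{i=k+1}^{N} \left(1 - \mathrm{erf}(\tilde\eta_i(\vartheta', s')/\sigma)\right)\delta(\eta_i)\right]\right\rangle_{\varepsilon, \vartheta^0} \qquad (34) $$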
where we have used the following conventions:

$$ \prod_{i=1}^{k} x_i = 1 \quad \text{if } k = 0; \qquad (35) $$

$$ \prod_{i=k+1}^{N} x_i = 1 \quad \text{if } k = N; \qquad (36) $$

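Extracting from the logarithm all the factors which do not depend on $s'$ and integrating
them on $\{\eta_i\}$ it is easy to show that the entropy of the responses can be expressed as
follows:

$$ H(\{\eta_i\}) = \sum_s \int d\vartheta\, p(\vartheta, s) \left\{ \sum_{k=1}^{N} \binom{N}{k} \frac{k}{2\ln 2}\left(1 + \ln 2\pi\sigma^2\right) \langle \mathrm{erf}(\tilde\eta(\vartheta, s)/\sigma)\rangle^{k}_{\varepsilon, \vartheta^0} \langle 1 - \mathrm{erf}(\tilde\eta(\vartheta, s)/\sigma)\rangle^{N-k}_{\varepsilon, \vartheta^0} + \sum_{k=0}^{N-1} \binom{N}{k} \frac{N-k}{\ln 2} \ln\frac{\epsilon}{2}\, \langle \mathrm{erf}(\tilde\eta(\vartheta, s)/\sigma)\rangle^{k}_{\varepsilon, \vartheta^0} \langle 1 - \mathrm{erf}(\tilde\eta(\vartheta, s)/\sigma)\rangle^{N-k}_{\varepsilon, \vartheta^0} + \sum_{k=1}^{N} \binom{N}{k} \frac{k}{2\ln 2} \left\langle \frac{\tilde\eta^2(\vartheta, s)}{\sigma^2}\, \mathrm{erf}(\tilde\eta(\vartheta, s)/\sigma)\right\rangle_{\varepsilon, \vartheta^0} \langle \mathrm{erf}(\tilde\eta(\vartheta, s)/\sigma)\rangle^{k-1}_{\varepsilon, \vartheta^0} \langle 1 - \mathrm{erf}(\tilde\eta(\vartheta, s)/\sigma)\rangle^{N-k}_{\varepsilon, \vartheta^0} + \sum_{k=1}^{N} \binom{N}{k} \frac{k}{2\ln 2} \left\langle \frac{\tilde\eta(\vartheta, s)}{\sqrt{2\pi}\,\sigma}\, e^{-\tilde\eta^2(\vartheta, s)/2\sigma^2}\right\rangle_{\varepsilon, \vartheta^0} \langle \mathrm{erf}(\tilde\eta(\vartheta, s)/\sigma)\rangle^{k-1}_{\varepsilon, \vartheta^0} \langle 1 - \mathrm{erf}(\tilde\eta(\vartheta, s)/\sigma)\rangle^{N-k}_{\varepsilon, \vartheta^0} \right\} - \left\langle \sum_s \int d\vartheta\, p(\vartheta, s) \sum_{k=0}^{N} \binom{N}{k} \int_0^{\infty} \prod_{i=1}^{k} d\eta_i \prod_{i=1}^{k} \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(\eta_i - \tilde\eta_i(\vartheta, s))^2/2\sigma^2} \prod_{i=k+1}^{N} \left(1 - \mathrm{erf}(\tilde\eta_i(\vartheta, s)/\sigma)\right) \log_2\left[\sum_{s'} \int d\vartheta'\, p(\vartheta', s') \left(\prod_{i=1}^{k} \exp\left[\frac{2\eta_i\tilde\eta_i(\vartheta', s') - \tilde\eta_i^2(\vartheta', s')}{2\sigma^2}\right] \prod_{i=k+1}^{N} \left(1 - \mathrm{erf}(\tilde\eta_i(\vartheta', s')/\sigma)\right)\right)\right]\right\rangle_{\varepsilon, \vartheta^0} \qquad (37) $$

where we have used the equality (32) and we have assumed that quenched disorder is
uncorrelated across units and stimuli.

Subtracting the equivocation, eq. (31), from the entropy of the responses, eq. (37), it is
easy to see that after summation on $k$ the first two terms in eq. (37) simplify with analogous
terms in eq. (31), so that the logarithmic divergences for $\epsilon \to 0$ cancel out.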
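Finally the mutual information can be rewritten as follows:

$$ I(\{\eta_i\}, \vartheta \otimes s) = \sum_s \int d\vartheta\, p(\vartheta, s) \left\{ \frac{N}{\ln 2} \left\langle \frac{\tilde\eta(\vartheta, s)}{\sqrt{2\pi}\,\sigma}\, e^{-\tilde\eta^2(\vartheta, s)/2\sigma^2}\right\rangle_{\varepsilon, \vartheta^0} + \frac{N}{\ln 2} \left\langle \left[1 - \mathrm{erf}(\tilde\eta(\vartheta, s)/\sigma)\right] \ln\left[1 - \mathrm{erf}(\tilde\eta(\vartheta, s)/\sigma)\right]\right\rangle_{\varepsilon, \vartheta^0} + \frac{N}{2\ln 2} \left\langle \frac{\tilde\eta^2(\vartheta, s)}{\sigma^2}\, \mathrm{erf}(\tilde\eta(\vartheta, s)/\sigma)\right\rangle_{\varepsilon, \vartheta^0} - \left\langle \sum_{k=0}^{N} \binom{N}{k} \int_0^{\infty} \prod_{i=1}^{k} d\eta_i \prod_{i=1}^{k} \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(\eta_i - \tilde\eta_i(\vartheta, s))^2/2\sigma^2} \prod_{i=k+1}^{N} \left(1 - \mathrm{erf}(\tilde\eta_i(\vartheta, s)/\sigma)\right) \log_2\left[\sum_{s'} p(s') \int d\vartheta'\, p(\vartheta') \left(\prod_{i=1}^{k} \exp\left[\frac{2\eta_i\tilde\eta_i(\vartheta', s') - \tilde\eta_i^2(\vartheta', s')}{2\sigma^2}\right] \prod_{i=k+1}^{N} \left(1 - \mathrm{erf}(\tilde\eta_i(\vartheta', s')/\sigma)\right)\right)\right]\right\rangle_{\varepsilon, \vartheta^0} \right\} \qquad (38) $$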

Eq. (38) constitutes the final expression for the mutual information in the general case. To
proceed with the analytical evaluation one must now resort to some approximation. As
suggested in [7], in the limit when the noise $\sigma$ is large one can expand the error functions
in eq. (38):

$$ \mathrm{erf}(x) \simeq \frac{1}{2} + \frac{1}{\sqrt{2\pi}}\, x + o(x^2) \qquad (39) $$

A Taylor expansion can also be performed for the exponentials under the logarithm. One
has:

(40)

Expanding all the products, one can expand the logarithm again in powers of $1/\sigma$, very
much as shown in eq. (15). The result is then integrated on $\{\eta_i\}$ and the binomial sums in
eq. (38) can be performed. Details about the evaluation are given in appendix B.

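Finally the mutual information can be written as follows:

$$ I(\{\eta_i\}, \vartheta \otimes s) \simeq \frac{1}{\ln 2}\frac{N}{2\sigma^2}\left(\frac{1}{2} + \frac{1}{\pi}\right)\left\langle \sum_s p(s) \int d\vartheta\, p(\vartheta)\,[\tilde\eta(\vartheta, s)]^2 - \sum_{s,s'} p(s)p(s') \int d\vartheta \int d\vartheta'\, p(\vartheta)p(\vartheta')\,\tilde\eta(\vartheta, s)\tilde\eta(\vartheta', s')\right\rangle_{\varepsilon, \vartheta^0} = \frac{1}{\ln 2}\frac{N(\eta^0)^2}{4\sigma^2}\left(\frac{1}{2} + \frac{1}{\pi}\right)\left\{\frac{p-1}{p}\, 2(\alpha - A_1)^2\left[\langle\varepsilon^2\rangle_\varepsilon - \langle\varepsilon\rangle_\varepsilon^2\right] + 2\left(A_2 - (A_1)^2\right)\langle\varepsilon^2\rangle_\varepsilon\right\} \qquad (41) $$

where we have used eqs. (23), (27), and $A_1$, $A_2$ and $\alpha$ have been defined in eq. (27).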

This result equals the expression obtained using replicas in [7], showing that even with
a more complicated distribution than the simple gaussian model, the evaluation can
be carried out via a simple Taylor expansion of the logarithm, and no significant advantage
derives from the use of the replica trick in the limit case of large noise.

Comparing eqs. (41) and (28) one can notice that, limited to the case of large noise, the
effect of thresholding the gaussian distribution on the information is merely a
renormalization of the noise by a factor $1/\sqrt{1/2 + 1/\pi}$. In [7] it has been shown that this
renormalization effect holds at higher orders in $1/\sigma^2$.
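The noise renormalization can be probed numerically in the simplest case. The sketch below is an illustration under explicit assumptions: a single cell, two equally likely stimuli with fixed mean rates (no quenched average), and the thresholded model read as a gaussian truncated at zero plus a peak of weight $1 - \mathrm{erf}(\tilde\eta/\sigma)$ at zero rate. In the large noise regime the ratio between the thresholded and the gaussian information is then expected to approach $1/2 + 1/\pi \approx 0.82$. Function names and parameter values are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution (the erf of eq. (30))."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gaussian_density(rate, mean_rate, sigma):
    norm = math.sqrt(2.0 * math.pi * sigma ** 2)
    return math.exp(-(rate - mean_rate) ** 2 / (2.0 * sigma ** 2)) / norm


def continuous_information(mean_rates, sigma, lower, upper, steps=20000):
    """Trapezoid estimate of sum_s p(s) * int p(r|s) log2[p(r|s) / p(r)] dr."""
    n_stimuli = len(mean_rates)
    step = (upper - lower) / steps
    total = 0.0
    for k in range(steps + 1):
        rate = lower + k * step
        conditional = [gaussian_density(rate, mean, sigma) for mean in mean_rates]
        mixture = sum(conditional) / n_stimuli
        value = sum(c * math.log2(c / mixture) for c in conditional if c > 0.0)
        total += (0.5 if k in (0, steps) else 1.0) * value / n_stimuli
    return total * step


def gaussian_information(mean_rates, sigma, width=10.0):
    """Single-cell information for the untruncated gaussian model."""
    lower = min(mean_rates) - width * sigma
    upper = max(mean_rates) + width * sigma
    return continuous_information(mean_rates, sigma, lower, upper)


def thresholded_information(mean_rates, sigma, width=10.0):
    """Single-cell information for the truncated gaussian plus a peak at zero."""
    n_stimuli = len(mean_rates)
    silent = [1.0 - normal_cdf(mean / sigma) for mean in mean_rates]
    silent_mixture = sum(silent) / n_stimuli
    discrete = sum(q * math.log2(q / silent_mixture) for q in silent if q > 0.0)
    upper = max(mean_rates) + width * sigma
    positive = continuous_information(mean_rates, sigma, 0.0, upper)
    return discrete / n_stimuli + positive
```

With mean rates `[0.0, 1.0]` and `sigma = 5.0`, the ratio `thresholded_information(...) / gaussian_information(...)` stays below one and lies within a few percent of $1/2 + 1/\pi$; the residual difference comes from terms of higher order in $\tilde\eta/\sigma$.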



III. POPULATION INFORMATION IN THE ASYMPTOTIC REGIME OF
LARGE N AND SMALL σ

We turn now to the analysis of the asymptotic regime in the information curve, for a large
number of neurons. A first attempt to solve this limit in the case of independent gaussian
units and discrete stimuli, as in eq. (1), was made in [6] by means of the replica trick. Yet,
probably due to some too strong approximation in summing on replicas, the final analytical
expression was incorrect, according to the authors.

The asymptotic behaviour was then studied in [9] separately in the case of discrete and
continuous stimuli, and for a generic distribution of independent units, yet in the absence of
additional quenched disorder. We try here to go further and study the case where quenched
disorder is present, as in distribution (1), and mixed continuous and discrete dimensions
characterize simultaneously the stimulus structure. We focus first on the simpler case of
purely discrete stimuli. In this context we compare an approach which is equivalent in the
nature of the approximation to the one presented in [?], yet replica free, to the approach
presented in [?]. We show in detail under which approximation the two approaches provide
the same result.



A. Coding of discrete stimuli in a gaussian approximation 

Let us reconsider eq. (14). When $\sigma$ becomes very small the probability density
$p(\eta_i|s)$ can be approximated by a $\delta$-function:

$$ \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(\eta_i - \eta_i^s)^2/2\sigma^2} \longrightarrow \delta(\eta_i - \eta_i^s) \qquad (42) $$

This approximation, which corresponds to freezing the quenched disorder represented by the
variables $\{\eta_i\}$ under the logarithm, has been used in [?] after getting rid of the logarithm
itself by means of the replica trick. Under this approximation the integration on $\{\eta_i\}$ can
be performed and the mutual information can be rewritten as follows:



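$$ I(\{\eta_i\}, s) \simeq -\frac{1}{\ln 2}\left\langle \sum_s p(s) \ln\left[\sum_{s'} p(s') \prod_i e^{-(\eta_i^s - \eta_i^{s'})^2/2\sigma^2}\right]\right\rangle_\eta = \log_2 p - \frac{1}{\ln 2}\left\langle \ln\left[1 + \sum_{s' \neq s} \prod_i e^{-(\eta_i^s - \eta_i^{s'})^2/2\sigma^2}\right]\right\rangle_\eta \qquad (43) $$

where the upper bound $\log_2 p$ derives from the term with $s = s'$ in the sum on $s'$ and we have
used $p(s) = \mathrm{const.} = 1/p$.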

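Since $\sigma$ is small one can expand the logarithm:

$$ \ln\left[1 + \sum_{s' \neq s} \prod_i e^{-(\eta_i^s - \eta_i^{s'})^2/2\sigma^2}\right] \simeq \sum_{s' \neq s} \prod_i e^{-(\eta_i^s - \eta_i^{s'})^2/2\sigma^2} - \frac{1}{2} \sum_{s' \neq s} \sum_{s'' \neq s} \prod_i e^{-\left[(\eta_i^s - \eta_i^{s'})^2 + (\eta_i^s - \eta_i^{s''})^2\right]/2\sigma^2} \qquad (44) $$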



In the appendix we show that inserting this expansion in eq. (43) one can perform the
quenched averages and derive an explicit expression for the mutual information. The final
result reads:



H{Vi},s) ~ log 2 p 



1 - 



P 



S x (V2^(rh) N - (p - 2) S 2 (2vra 2 / 2 ) 



N 



(45) 



log 2 p In 2 

where we have considered the leading term order a and the first correction of order a 2 , and 
one has: 



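$$S_1=\sum_{k=1}^{\infty}\frac{(-1)^{k+1}}{k^{N/2+1}};\qquad S_2=\sum_{k=1}^{\infty}\frac{(-1)^{k+1}}{k^{N/2}};\qquad I_1=\int d\eta\,\varrho^2(\eta);\qquad I_2=\int d\eta\,\varrho^3(\eta); \tag{46}$$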



When the noise goes to zero and the population size is large the information reaches the
upper bound of $\log_2 p$.
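
To get a feeling for how quickly the ceiling is approached, the following sketch evaluates an expression of the form $\log_2 p-\frac{p-1}{\ln 2}\left[S_1(\sqrt{2\pi}\sigma I_1)^N-(p-2)S_2(2\pi\sigma^2 I_2)^N\right]$, with $S_1=\sum_k(-1)^{k+1}k^{-(N/2+1)}$ and $S_2=\sum_k(-1)^{k+1}k^{-N/2}$. This is our reading of eqs. (45) and (46), which should be taken as an assumption; the default values $I_1=I_2=1$ correspond to a quenched distribution that is uniform on $[0,1]$, again an illustrative choice. Function names are hypothetical.

```python
import math

def alternating_sum(exponent, max_terms=100000, tol=1e-16):
    """sum_{k>=1} (-1)^(k+1) / k**exponent.

    If the terms do not fall below tol within max_terms, the mean of the last two
    partial sums is returned, which damps the oscillation of a slowly converging
    alternating series.
    """
    total = 0.0
    previous = 0.0
    for k in range(1, max_terms + 1):
        term = k ** (-exponent)
        previous = total
        total += term if k % 2 == 1 else -term
        if term < tol:
            return total
    return 0.5 * (total + previous)

def asymptotic_information(n_cells, p, sigma, i1=1.0, i2=1.0):
    """Assumed small-noise expansion: log2(p) minus the sigma^N and sigma^(2N) terms."""
    s1 = alternating_sum(n_cells / 2.0 + 1.0)
    s2 = alternating_sum(n_cells / 2.0)
    leading = s1 * (math.sqrt(2.0 * math.pi) * sigma * i1) ** n_cells
    correction = (p - 2) * s2 * (2.0 * math.pi * sigma ** 2 * i2) ** n_cells
    return math.log2(p) - (p - 1) / math.log(2.0) * (leading - correction)

if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        print(n, asymptotic_information(n, p=4, sigma=0.1))
```

With these assumed inputs the distance from $\log_2 p$ shrinks geometrically in $N$, and the $\sigma^{2N}$ term matters only for the smallest populations and the largest noise values.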

Fig. 2 shows the mutual information according to eq. (45) as a function of the population
size and for different values of the noise $\sigma$. Circles and stars are respectively for
the full mutual information, with both the leading and the correction terms in eq. (45), and
with only the leading term of order $\sigma^N$. As is evident from the plot, the larger
$\sigma$, the slower the approach to the ceiling, and the larger the weight acquired by the
correction term of order $\sigma^{2N}$.

An alternative replica-free method to study the asymptotic information regime has been
proposed in [9]. In principle the method looks quite efficient, and moreover it can be applied
both to continuous and to discrete stimuli. Yet no additional quenched disorder affected the
distributions considered in [9]. We now try to apply the method to our particular coding
scheme.

FIG. 2: Mutual information as in eq. (45), as a function of the population size N. Different
curves correspond to different values of the noise $\sigma$; circles are for the information
with only the leading term of order $\sigma^N$ in eq. (45), while stars are for the full
information including also the correction of order $\sigma^{2N}$.

Let us reconsider eq. (2) for the mutual information. With a change of variables it can be
rewritten:

$$I=-\frac{1}{\ln 2}\sum_s p(s)\int\prod_{s'\neq s}dX_{s'}\;\big\langle p(\{X_{s'}\})\big\rangle_{\eta}\,\ln\Big[\sum_{s'}p(s')\,e^{N X_{s'}}\Big]; \tag{47}$$

$$X_{s'}=\frac{1}{N}\ln\frac{p(\{\eta_i\}|s')}{p(\{\eta_i\}|s)}=\frac{1}{N}\sum_i\frac{(\eta_i^s-\eta_i^{s'})(\eta_i^s+\eta_i^{s'}-2\eta_i)}{2\sigma^2}; \tag{48}$$

$$p(\{X_{s'}\})=\int d\{\eta_i\}\;p(\{\eta_i\}|s)\prod_{s'\neq s}\delta\big(X_{s'}-X_{s'}(\{\eta_i\},s,s')\big); \tag{49}$$

where $X_{s'}=0$ for $s'=s$, and in deriving the explicit expression for $X_{s'}$ we have used
the gaussian form of the distribution $p(\{\eta_i\}|s)$. This change of variables moves the
quenched disorder from inside the logarithm to the distribution $p(\{X_{s'}\})$.

Using the integral representation of the $\delta$-function and integrating on $\{\eta_i\}$,
eq. (49) can be rewritten as follows:


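$$p(\{X_{s'}\})=\int\prod_{s'\neq s}\frac{dY_{s'}}{2\pi}\,\exp\Big\{-i\sum_{s'\neq s}Y_{s'}\Big[X_{s'}+\frac{1}{N}\sum_i\frac{(\eta_i^s-\eta_i^{s'})^2}{2\sigma^2}\Big]\Big\}\;\prod_i\exp\Big\{-\frac{1}{2N^2\sigma^2}\Big[\sum_{s'\neq s}Y_{s'}(\eta_i^s-\eta_i^{s'})\Big]^2\Big\}; \tag{50}$$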

We notice that when $N$ grows large and $\sigma$ becomes small the argument of the second
exponential is of order $1/N$ with respect to the first one: a relatively flat gaussian
multiplies a strongly oscillating periodic function. Thus we set the second function equal to
one, obtaining:

$$p(\{X_{s'}\})\simeq\prod_{s'\neq s}\delta\Big(X_{s'}+\frac{1}{N}\sum_i\frac{(\eta_i^s-\eta_i^{s'})^2}{2\sigma^2}\Big); \tag{51}$$

We notice that:

$$-\frac{1}{N}\sum_i\frac{(\eta_i^s-\eta_i^{s'})^2}{2\sigma^2}=-\frac{1}{N}\int\prod_i d\eta_i\;p(\{\eta_i\}|s)\,\ln\frac{p(\{\eta_i\}|s)}{p(\{\eta_i\}|s')}=-\frac{1}{N}KL(s'\|s); \tag{52}$$

where $KL(s'\|s)$ is the Kullback-Leibler divergence between the distributions
$p(\{\eta_i\}|s')$ and $p(\{\eta_i\}|s)$. Thus under this approximation the distribution
$p(\{X_{s'}\})$ factorizes into a product of $\delta$-functions centered on the mean value of
$X_{s'}$ [9].
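
The statement that the $\delta$-functions sit at the mean of $X_{s'}$ can be checked numerically. The sketch below samples rates from independent gaussians centered on $\eta_i^s$ with variance $\sigma^2$, averages $X_{s'}$ as written in eq. (48), and compares the result with the closed form $-\frac{1}{N}\sum_i(\eta_i^s-\eta_i^{s'})^2/2\sigma^2$. The numerical values of the means and the function names are illustrative assumptions.

```python
import math
import random

def mean_log_likelihood_ratio(eta_s, eta_s2, sigma, n_samples=20000, seed=1):
    """Monte Carlo mean of X = (1/N) ln[p(eta|s') / p(eta|s)] with eta ~ p(eta|s)."""
    rng = random.Random(seed)
    n = len(eta_s)
    total = 0.0
    for _ in range(n_samples):
        x = 0.0
        for a, b in zip(eta_s, eta_s2):
            eta = rng.gauss(a, sigma)  # rate of one cell under stimulus s
            x += (a - b) * (a + b - 2.0 * eta) / (2.0 * sigma ** 2)
        total += x / n
    return total / n_samples

def minus_kl_per_cell(eta_s, eta_s2, sigma):
    """Closed form -(1/N) sum_i (eta_s[i] - eta_s2[i])^2 / (2 sigma^2)."""
    n = len(eta_s)
    return -sum((a - b) ** 2 for a, b in zip(eta_s, eta_s2)) / (2.0 * sigma ** 2 * n)

if __name__ == "__main__":
    means_s = [0.2, 0.5, 0.9]    # hypothetical quenched means for stimulus s
    means_s2 = [0.3, 0.1, 0.8]   # hypothetical quenched means for stimulus s'
    print(mean_log_likelihood_ratio(means_s, means_s2, 0.2))
    print(minus_kl_per_cell(means_s, means_s2, 0.2))
```

The two numbers agree within the Monte Carlo error, while the spread of $X_{s'}$ around its mean is what the neglected second exponential in eq. (50) accounts for.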

Inserting eq. (51) in the expression for the mutual information, eq. (47), it is easy to show
that integrating on $\{X_{s'}\}$ one recovers eq. (43). Thus approximating the variables
$\{X_{s'}\}$ with the mean of the distribution, as in eq. (51), corresponds to the
$\delta$-function approximation, eq. (42).

A more accurate estimate might be obtained by calculating the correction given by the second
exponential in eq. (50), which we had previously neglected. Yet integration on $\{Y_{s'}\}$
would lead to the introduction of matrices which depend on the quenched disorder
$\{\eta_i^s\}$, which therefore must be averaged out first in order to derive
$\langle p(\{X_{s'}\})\rangle_{\eta}$. The average might be performed by specifying the
distribution of the quenched disorder $\varrho(\eta)$, but even in this case integrating out
the variables $\{\eta_i^s\}$ introduces a nontrivial dependence on the variables
$\{Y_{s'}\}$, which in turn can be integrated out only by means of a further ansatz. Details
about the evaluation of the correction and about the results will be published elsewhere [?].



B. Coding of mixed discrete and continuous stimuli: gaussian vs thresholded gaussian model

As we have shown in the previous section, in the large $\sigma$ limit both the technical
evaluation and the final expression for the mutual information are formally the same whether
the stimuli are discrete or continuous. On the other hand, in the asymptotic regime of small
$\sigma$ a qualitative difference exists between the cases of discrete and of continuous
stimuli: in the former the information is bounded by the entropy of the stimulus set, while
in the latter the information grows to infinity as the noise goes to zero, or as the number
of neurons grows to infinity for a finite noise. Here we calculate the expression of this
asymptotic growth, which corresponds to the upper bound in the case of purely discrete
stimuli. We compare the two cases of gaussian and thresholded gaussian models, in order to
assess whether, as in the large $\sigma$ regime, a plain relationship like a renormalization
of the noise links the two expressions.

Let us reconsider the distribution (18). The expression for the mutual information is the
following, which we recall:



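$$I(\{\eta_i\},\vartheta\otimes s)=-\frac{1}{\ln 2}\Big\langle\sum_s p(s)\int d\vartheta\,p(\vartheta)\int\prod_i\frac{d\eta_i}{\sqrt{2\pi\sigma^2}}\;e^{-(\eta_i-\tilde\eta_i(\vartheta,s))^2/2\sigma^2}\;\ln\Big[\sum_{s'}p(s')\int d\vartheta'\,p(\vartheta')\prod_i e^{-[(\eta_i-\tilde\eta_i(\vartheta',s'))^2-(\eta_i-\tilde\eta_i(\vartheta,s))^2]/2\sigma^2}\Big]\Big\rangle_{\varepsilon,\vartheta^0}; \tag{53}$$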



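In analogy with the approximation (42), in the limit when $\sigma$ becomes very small we use:

$$\frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-(\eta_i-\tilde\eta_i(\vartheta,s))^2/2\sigma^2}\;\longrightarrow\;\delta\big(\eta_i-\tilde\eta_i(\vartheta,s)\big); \tag{54}$$

Under this approximation we obtain, in analogy to eq. (43):

$$I(\{\eta_i\},\vartheta\otimes s)\simeq-\frac{1}{\ln 2}\Big\langle\sum_s p(s)\int d\vartheta\,p(\vartheta)\,\ln\Big[\sum_{s'}p(s')\int d\vartheta'\,p(\vartheta')\prod_i e^{-[\tilde\eta_i(\vartheta',s')-\tilde\eta_i(\vartheta,s)]^2/2\sigma^2}\Big]\Big\rangle_{\varepsilon,\vartheta^0}; \tag{55}$$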



In the previous section we have seen that in the case of $p$ purely discrete stimuli the upper
bound, $\log_2 p$, was given by the term with $s = s'$ under the logarithm. Therefore in the
case of mixed continuous and discrete stimuli we expect the same term, after integration on
$\vartheta'$, to give the logarithm of a coefficient depending on the ratio between $N$ and
$\sigma^2$, which grows to infinity when $\sigma\to 0$. Let us extract and calculate the term
with $s = s'$ out of eq. (55):

$$\int d\vartheta'\,p(\vartheta')\,p(s)\prod_i e^{-[\tilde\eta_i(\vartheta',s)-\tilde\eta_i(\vartheta,s)]^2/2\sigma^2}; \tag{56}$$

It is clear that when $\sigma$ becomes small the major contribution to the integral comes from
the values $\vartheta'\simeq\vartheta$. Therefore we expand the difference
$\tilde\eta_i(\vartheta',s)-\tilde\eta_i(\vartheta,s)$:

$$\tilde\eta_i(\vartheta',s)-\tilde\eta_i(\vartheta,s)\simeq\frac{\partial\tilde\eta_i(\vartheta,s)}{\partial\vartheta}\,(\vartheta'-\vartheta)=-\frac{\eta^0}{2}\,\varepsilon_s^i\,\sin(\vartheta-\vartheta^0_{i,s})\,(\vartheta'-\vartheta); \tag{57}$$

where we have explicitly used the expressions for the tuning curves.

In the limit when $\sigma\to 0$ the resulting gaussian distribution is approximated with a
$\delta$-function and the integral on $d\vartheta'$ can be performed. Extracting this
contribution out of the logarithm, the information can be rewritten as follows:



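$$I(\{\eta_i\},\vartheta\otimes s)\simeq\log_2\Big[\frac{p\,\sqrt{\pi}\,\sqrt{N}\,\eta^0}{\sqrt{2}\,\sigma}\Big]+\frac{1}{2}\sum_s\int d\vartheta\,p(s,\vartheta)\Big\langle\log_2\Big[\frac{1}{N}\sum_i(\varepsilon_s^i)^2\sin^2(\vartheta-\vartheta^0_{i,s})\Big]\Big\rangle_{\varepsilon,\vartheta^0}-\frac{1}{\ln 2}\sum_s\int d\vartheta\,p(s,\vartheta)\Big\langle\ln\big[1+A\big]\Big\rangle_{\varepsilon,\vartheta^0}; \tag{58}$$

where $A$ denotes the ratio between the terms with $s'\neq s$ and the term with $s'=s$ under
the logarithm in eq. (55).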
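
The step from eq. (56) to the first two terms above is a saddle-point (Laplace) evaluation of the integral on $\vartheta'$. The sketch below compares a direct numerical integration of eq. (56), without the factor $p(s)$, against the estimate $\frac{1}{2\pi}\sqrt{2\pi\sigma^2/\sum_i(\partial_\vartheta\tilde\eta_i)^2}$. It assumes a uniform $p(\vartheta')=1/2\pi$ and a cosine tuning curve $\tilde\eta_i(\vartheta,s)=\varepsilon\,\frac{\eta^0}{2}[1+\cos(\vartheta-\vartheta^0)]+(1-\varepsilon)\eta^f$; this functional form, the parameter values and all function names are assumptions made for illustration.

```python
import math

def tuning(theta, eps, theta0, eta0=1.0, flat=0.0):
    """Hypothetical tuning curve: eps*(eta0/2)*(1 + cos(theta - theta0)) + (1 - eps)*flat."""
    return eps * 0.5 * eta0 * (1.0 + math.cos(theta - theta0)) + (1.0 - eps) * flat

def same_stimulus_integral(theta, cells, sigma, n_grid=20000):
    """Riemann sum over theta' of (1/2pi) prod_i exp(-(eta_i(theta') - eta_i(theta))^2 / (2 sigma^2))."""
    ref = [tuning(theta, e, t0) for e, t0 in cells]
    step = 2.0 * math.pi / n_grid
    total = 0.0
    for j in range(n_grid):
        tp = j * step
        sq = sum((tuning(tp, e, t0) - r) ** 2 for (e, t0), r in zip(cells, ref))
        total += math.exp(-sq / (2.0 * sigma ** 2))
    return total * step / (2.0 * math.pi)

def laplace_estimate(theta, cells, sigma, eta0=1.0):
    """(1/2pi) sqrt(2 pi sigma^2 / sum_i slope_i^2), using the linear expansion of the tuning curves."""
    slope_sq = sum((0.5 * eta0 * e * math.sin(theta - t0)) ** 2 for e, t0 in cells)
    return math.sqrt(2.0 * math.pi * sigma ** 2 / slope_sq) / (2.0 * math.pi)

if __name__ == "__main__":
    # (eps, preferred angle) pairs for five hypothetical cells
    cells = [(0.9, 0.3), (0.6, 2.0), (0.3, 4.0), (0.9, 5.5), (0.6, 1.1)]
    print(same_stimulus_integral(1.0, cells, 0.01), laplace_estimate(1.0, cells, 0.01))
```

For small $\sigma$ and at least two cells with different preferred angles the two values agree closely; with a single cell the cosine symmetry produces a second maximum of the integrand and the estimate would need to be doubled.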



The quenched average in the second term can be performed in the thermodynamic limit by
letting the average pass the logarithm, in a mean field approximation. The third term behaves
like $\log_2[1+A]$, where $A$ vanishes like $(\sqrt{N}/\sigma)\exp(-N/\sigma^2)$ when
$\sigma\to 0$. To leading order the information can finally be expressed in the following
form:

$$I(\{\eta_i\},\vartheta\otimes s)\simeq\log_2\Big[\frac{p\,\sqrt{\pi}\,\sqrt{N}\,\eta^0}{\sqrt{2}\,\sigma}\sqrt{\frac{\langle\varepsilon^2\rangle_{\varepsilon}}{2}}\Big]; \tag{59}$$

When the number of neurons grows to infinity and/or the noise tends to zero the information
grows logarithmically to infinity. Notice that the case of purely continuous stimuli can be
retrieved by setting $p = 1$: not surprisingly, the continuous dimension plays a major role
in determining the asymptotic growth of the information to infinity, with a relatively mild
modulation according to the number $p$ of discrete correlates.
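
The logarithmic growth can be made concrete with a few lines of code. The sketch assumes that the leading term has the form $\log_2[\,p\sqrt{\pi N}\,\eta^0\sqrt{\langle\varepsilon^2\rangle/2}\,/(\sqrt{2}\,\sigma)]$, which is our reading of eq. (59); the default $\langle\varepsilon^2\rangle=0.42$ corresponds to $\varepsilon$ taking the values 3/10, 6/10 and 9/10 with equal probability. The function name is hypothetical.

```python
import math

def gaussian_asymptotic_information(n_cells, sigma, p, eta0=1.0, eps_sq_mean=0.42):
    """Assumed leading asymptotic term for the gaussian model, in bits."""
    arg = p * math.sqrt(math.pi * n_cells) * eta0 / (math.sqrt(2.0) * sigma)
    return math.log2(arg * math.sqrt(eps_sq_mean / 2.0))

if __name__ == "__main__":
    base = gaussian_asymptotic_information(100, 0.01, 4)
    print(base)
    # quadrupling N, or halving sigma, adds one bit under the assumed form
    print(gaussian_asymptotic_information(400, 0.01, 4) - base)
    print(gaussian_asymptotic_information(100, 0.005, 4) - base)
```

Under this form the information depends on $N$ and $\sigma$ only through $\sqrt{N}/\sigma$, and the number $p$ of discrete correlates adds $\log_2 p$ bits.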

We turn now to the case of the thresholded-gaussian model, eq. (29). As we have shown
previously, the mutual information can be expressed as in eq. (38). In the limit of small
$\sigma$ we apply the approximation (54). After a rearrangement of the terms we obtain:

Out of this expression we aim at keeping the leading terms, which diverge when $\sigma$ goes
to zero and determine the asymptotic behaviour. In first approximation we neglect terms
going to zero with $\sigma$, which would play a role in the first correction to the
asymptotic value. As already done in the case of the gaussian model, we consider the term
with $s' = s$ in the discrete sum under the logarithm. Looking at the function to be
integrated on $\vartheta'$, it is easy to see that the product of the $k$ exponentials is
maximal for values of $\vartheta'$ close to $\vartheta$, while each one of the other $N-k$
factors containing error functions is maximal for cell and stimulus specific values of
$\vartheta'$, namely the ones corresponding to the smallest values of
$\tilde\eta_i(\vartheta',s)$. Thus the main contribution to the integral comes from the
values of $\vartheta'$ close to $\vartheta$ and we can repeat the procedure as in eqs. (57),
(58), obtaining:

(61)



Performing the sums on $k$ and taking the limit of small $\sigma$, it is easy to show that all
terms simplify or vanish with higher orders in $\sigma$, except for the second and third
terms in the sum on $k$. The third term can be evaluated in a mean field approximation,
passing the quenched average under the logarithm: since $N$ is very large, $k$ is also large
for most values. The leading asymptotic term finally reads:



I({r)i},$®s) 



log2 [-vf—y— 

f 



+l E/ s) £ I ^ I (1 - erf(fc(tf, log 2 i^^-^j ; (62) 

Since the first term is exactly the information for the gaussian model, eq. (59), we see that
already at leading order the rise to infinity with $N$ is slightly higher for the thresholded
gaussian model. It is quite difficult to extract analytically the exact dependence on $N$ and
$\sigma$ from eq. (62). We have evaluated the sum on $k$ numerically by means of a MATLAB code.

Fig. 3(a) shows the mutual information for both models as a function of the population
size. It must be said that while eq. (59) is valid for generic values of the parameters,
provided the noise is small, more restrictive assumptions underlie the derivation of eq. (62).
In particular one has to exclude the values for which the tuning curve
$\tilde\eta_i(\vartheta,s)$ is identically zero (namely $\alpha = 0$ and $\varepsilon_s^i = 1$
with a finite weight in eq. (19)): the reason is that for very small values of the noise the
weight of the $\delta$ peak in the distribution (29) becomes proportional to
$\delta(\tilde\eta_i(\vartheta,s))$, and several terms of the full expression cannot be
neglected any more.

Moreover, for any fixed value of $N$ there is an upper bound on the value of the noise,
beyond which the approximation must be supplemented by the neglected terms. This can be seen
by direct analytical evaluation of each term.

As an example we show in Fig. 3(b) this effect for a population of 5 cells: for values of
$\sigma$ close to 0.05 the decrease in the information for the thresholded gaussian model
starts slowing down, and for higher values of the noise the information would even increase.

FIG. 3: Information for the gaussian vs thresholded gaussian model with mixed continuous and
discrete stimuli, according to eqs. (59), (62). $\eta^0 = 1$; $p = 4$; $\alpha = 0.2$;
$\varepsilon$ can take the values 3/10, 6/10 and 9/10 with equal probability. (a) Asymptotic
behaviour as a function of the population size $N$ for $\sigma = 0.01$. (b) As a function of
the noise $\sigma$ for a population size of $N = 5$ cells. Legend: gauss, th-gauss.

IV. DISCUSSION 



We have presented a detailed analysis of the mutual information carried by one population
of independent units about a set of stimuli, examining both the case where the stimuli
are purely discrete and the case where they are characterized by an additional continuous
angular dimension. In fact, even though in real experiments performed on trained animals
the stimuli always vary in a discrete set, since even continuous dimensions are sampled on
a finite number of points, in nature the real brain must learn how to discriminate highly
dimensional stimuli or behavioural correlates, whose dimensions may equally be continuous
or discrete. For our specific model, we have been inspired by data recorded in the motor
cortex of monkeys performing arm movements, which might be parameterized according to a
(continuous) direction and to a (discrete) "type". We have focused on two possible
limits, namely the limit of finite population size and large noise, which corresponds to the
initial information rise, and the asymptotic regime of large numbers of neurons and small
noise. The limit of large noise has been recently studied in [?] by means of the replica
trick. Here we have shown that, regardless of the structure of the stimulus, whether
continuous or discrete, the same results can be obtained without resorting to the replica
trick, by a mere expansion of the logarithm. Moreover we have shown that correlations
introduced in the preferred orientations of each neuron, across different values of the
discrete parameter, can increase the redundancy, depressing the information. This issue is
biologically relevant since in several cortical areas neurons show a tuning for the
direction, and in particular in the data set analyzed in [?] such correlations were indeed
observed.

Modifying the theoretical model accordingly did improve the fit of real information curves,
suggesting that the correlations detected in the data are information bearing, here in a
negative sense: they depress the information content.

We have been able to study analytically the asymptotic approach to the upper information
bound with purely discrete stimuli, again without replicas, calculating both the leading term
of order $\sigma^N$ and the correction of order $\sigma^{2N}$. We have shown how to retrieve
our results using a different approach [9], also replica free. This approach does allow one
to go beyond our original approximation. Yet, in the presence of additional quenched disorder,
as in the case of our specific model, further assumptions are necessary to proceed with the
analytical evaluation of the correction. A careful analysis is still in progress and will be
presented elsewhere [?].

Finally we have evaluated the asymptotic information value in the presence of mixed
continuous and discrete stimuli, which grows to infinity when the noise goes to zero and/or
the number of neurons becomes large. We have found that the information grows to infinity
logarithmically with $N$ and with the inverse of the noise for the gaussian model, while the
exact dependence for the thresholded gaussian model is more difficult to determine. Under
certain conditions and for very low values of the noise the asymptotic information is higher
for the thresholded gaussian model than for the pure gaussian.

This result is quite interesting per se, but we refrain from any speculation, leaving its
interpretation to a more careful evaluation of the information. In particular it will be
interesting to evaluate the impact of the corrections neglected in the asymptotic expansion
for the thresholded gaussian model.

It must be said that the validity of our results is not checked here by means of numerical
simulations. In a previous work [?] we have shown and discussed in detail that, since
information is very sensitive to limited sampling, in our specific case of a continuous
rate model with additional quenched parameters the numerical evaluation turns out to be
extremely hard, especially in the most interesting limit of large population sizes. The
simulations presented there were performed using a decoding procedure which is meant to
reduce the bias. Even so, agreement with the analytical results was found only for a
population size of at most 2 cells, the curve deviating, due to the distortion caused by
decoding, for larger population sizes. We are currently working to improve the numerical
techniques and obtain a better check of our analytical results for most of the parameter
space. Nonetheless, we think that the widely established difficulty in getting accurate
numerical information estimates with models of this type makes our analytical efforts and
the results presented here even more remarkable.

APPENDIX A: THE SMALL $\sigma$ LIMIT IN PRESENCE OF PURELY DISCRETE STIMULI

Let us reconsider eq. (43). Inserting the expansion (44) one obtains:

(A1)

where I have used the fact that the quenched disorder is uncorrelated and identically
distributed across neurons and stimuli.

When $\sigma$ becomes very small one can use the following approximation:

$$e^{-(\eta^s-\eta^{s_1})^2/2\sigma^2}\simeq\sqrt{2\pi\sigma^2}\;\delta(\eta^s-\eta^{s_1}); \tag{A2}$$

Let us reconsider the term with $k = 1$ in eq. (A1):

$$\sum_{s_1\neq s}\left(\sqrt{2\pi}\,\sigma\right)^N\Big[\int d\eta^s\,d\eta^{s_1}\,\varrho(\eta^s)\,\varrho(\eta^{s_1})\,\delta(\eta^s-\eta^{s_1})\Big]^N; \tag{A3}$$

Therefore this term gives a contribution of order $\sigma^N$.
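
The quality of the replacement (A2) inside the quenched average can be probed numerically. The sketch below computes the single-cell double integral $\int d\eta\,d\eta'\,\varrho(\eta)\varrho(\eta')\,e^{-k(\eta-\eta')^2/2\sigma^2}$ on a grid and compares it with $\sqrt{2\pi\sigma^2/k}\int d\eta\,\varrho^2(\eta)$, where $k$ counts identical exponentials multiplied together. It assumes a quenched density supported on $[0,1]$ (uniform by default); the support, the grid size and the function names are illustrative assumptions.

```python
import math

def pair_average(sigma, k=1, density=lambda x: 1.0, n_grid=400):
    """Midpoint-rule estimate of the double integral over [0, 1]^2 of
    rho(x) rho(y) exp(-k (x - y)^2 / (2 sigma^2))."""
    h = 1.0 / n_grid
    pts = [(j + 0.5) * h for j in range(n_grid)]
    w = [density(x) for x in pts]
    total = 0.0
    for x, wx in zip(pts, w):
        for y, wy in zip(pts, w):
            total += wx * wy * math.exp(-k * (x - y) ** 2 / (2.0 * sigma ** 2))
    return total * h * h

def delta_estimate(sigma, k=1, density=lambda x: 1.0, n_grid=400):
    """sqrt(2 pi sigma^2 / k) times the integral of rho^2 over [0, 1]."""
    h = 1.0 / n_grid
    i1 = sum(density((j + 0.5) * h) ** 2 for j in range(n_grid)) * h
    return math.sqrt(2.0 * math.pi * sigma ** 2 / k) * i1

if __name__ == "__main__":
    for k in (1, 4):
        print(k, pair_average(0.05, k) / delta_estimate(0.05, k))
```

For the uniform density the ratio stays slightly below one because of the edges of the support, and it moves closer to one as $\sigma/\sqrt{k}$ decreases, consistent with the approximation being a small-noise statement.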

It is easy to check that at each order $k$ the term with $s_1 = s_2 = \dots = s_k \neq s$
gives a contribution of order $\sigma^N$: in fact, while for a generic choice of
$s_1, s_2, \dots, s_k$ out of the $p$ correlates one is finally left with several
$\delta$-functions, each of which, according to eq. (A2), carries a factor $\sigma$, when all
the stimuli are equal only one $\delta$-function remains and the result is of order
$\sigma^N$. Therefore one has to sum all the contributions to calculate the exact coefficient
determining the asymptotic approach of the information to the upper bound. For a generic
order $k$ in the expansion of the logarithm one has:

(A4)

It is now clear that to obtain the contribution at all orders $k$ one must sum the series:

(A5)



and the final contribution to the mutual information up to order $\sigma^N$ can be expressed
as follows:

$$(p-1)\,S_1\left(\sqrt{2\pi}\,\sigma I_1\right)^N;\qquad I_1=\int d\eta\,\varrho^2(\eta); \tag{A6}$$

Inserting eq. (A6) in eq. (A1) one obtains the expression of the asymptotic approach to the
ceiling up to order $\sigma^N$:

(A7)

This result can be easily extended to higher powers of $\sigma$. The first correction of
order $\sigma^{2N}$ to the leading term can be calculated through the same technique. Since
at the $k$-th order in the expansion of the logarithm the factor $\sigma^N$ was obtained by
considering only the configuration where all the stimuli $s_1, \dots, s_k$ are equal, it is
clear that each configuration where all the stimuli except one are equal will generate a
factor $\sigma^{2N}$; in fact if, say, $l$ stimuli are different from one another among the
$k$, one has to introduce a $\delta$-function for each of the $l$ exponentials, according to
eq. (A2). Let us see in detail the $k$-th order contribution, assuming for example that all
stimuli are equal except $s_1$:

(A8)

A $\delta$-function is introduced according to eq. (A2) for each of the two exponentials in
eq. (A8); since one has $k$ possible choices for the stimulus which is different from the
other $k-1$, the final result must be multiplied by a further factor $k$; finally one has:

$$\frac{(-1)^{k+1}}{(k-1)^{N/2}}\,(p-1)(p-2)\left(2\pi\sigma^2\right)^N\Big[\int d\eta\,\varrho^3(\eta)\Big]^N; \tag{A9}$$

Summing all terms at any order $k$, the total contribution of order $\sigma^{2N}$ to the
mutual information can be written as follows:

(A10)

Inserting this result in eq. (A7) one obtains the final expression for the mutual information
up to order $\sigma^{2N}$, eq. (45).



APPENDIX B: LARGE $\sigma$ LIMIT FOR THE THRESHOLDED-GAUSSIAN MODEL

Let us reconsider the expressions for the thresholded-gaussian model. Developing all the
products one can expand the logarithm in powers of $1/\sigma$:

(B1)

where we have kept only the terms which will give a contribution of order $1/\sigma^2$ to the
information.

This expression has to be inserted in eq. (38) and integrated on $\{\eta_i\}$. One obtains:

(B2)

where we have assumed that the quenched disorder is uncorrelated across neurons.

It must be noticed that the terms of order $N^2$ appearing in the expansion of the logarithm,
eq. (B1), do not appear in eq. (B2) any more: it can easily be shown that after averaging
across the stimuli they cancel out.

The sums on $k$ in eq. (B2) can be performed and the result is then inserted in the expression
for the mutual information, eq. (38), which finally can be rewritten as follows:



J({7fc},#® s) = E / dtip(9,8 
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N 12 /rj(ti',s') 



N [l-evims)/a)]) - t 1 - (W «)/^)] )(B3 

In 2 V vr \ a / £)i?0 7r In 2 \ a 2 / £ i?0 

Using the expansion of the error function and keeping only the terms up to order
$N/\sigma^2$, one arrives at the large $\sigma$ expression given in the main text.